Model Quantization, ONNX Runtime, Embedded Inference, TinyML

TinyML is the most impressive piece of software you can run on any ESP32
xda-developers.com·17h
🏗️AI Infrastructure
Beyond the Hype: The Hidden Economics of AI Inference
dev.to·6h
🏗️AI Infrastructure
From Lossy to Lossless Reasoning
manidoraisamy.com·9h
🧩Low-code
Raising the Bar on ML Model Deployment Safety
uber.com·1d
🏗️AI Infrastructure
Accelerating AI inferencing with external KV Cache on Managed Lustre
cloud.google.com·11h
🏗️AI Infrastructure
Custom Intelligence: Building AI that matches your business DNA
aws.amazon.com·11h
🏗️AI Infrastructure
MIT’s Survey On Accelerators and Processors for Inference, With Peak Performance And Power Comparisons
semiengineering.com·10h
🏗️AI Infrastructure
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
youtube.com·10h
🏗️AI Infrastructure
RL for Reasoning by Adaptively Revealing Rationales
machinelearning.apple.com·3d
🏗️AI Infrastructure
How Well Does RL Scale?
tobyord.com·1d
🏗️AI Infrastructure
Show HN: Everything it took to run an LLM at 10k tok/s on H200s
relace.ai·2d
🏗️AI Infrastructure
Daily Artificial Intelligence Digest - Oct 31, 2025
dev.to·1d
🏗️AI Infrastructure
Agentic AI: A Comprehensive Survey of Architectures, Applications, and Future Directions
arxiv.org·1d
🏗️AI Infrastructure
Context-Bench: Benchmarking LLMs on Agentic Context Engineering
letta.com·8h
🏗️AI Infrastructure
Building AI-Powered APIs in Minutes, Not Months
dev.to·22h
☁️Serverless Rust
Show HN: Hot or Slop – Visual Turing test on how well humans detect AI images
hotorslop.com·1d
🏗️AI Infrastructure
Your Transformer is Secretly an EOT Solver
elonlit.com·22h
⏱️Time-series Optimization
Where to Buy or Rent GPUs for LLM Inference: The 2026 GPU Procurement Guide
bentoml.com·13h
🏗️AI Infrastructure